The HVT package is a collection of R functions for building topology-preserving maps for rich multivariate data analysis, particularly for datasets tending toward big data, i.e., a large number of rows. The R functions for this typical workflow are organized below:
Data Compression: Vector Quantization (VQ), HVQ (Hierarchical Vector Quantization) using means or medians. This step compresses the rows (long data frame) using a compression objective.
Data Projection: Dimension projection of the compressed cells to 1D, 2D, or an interactive surface plot using the Sammon non-linear mapping algorithm. This step creates the coordinates of a topology-preserving map (also called a mathematical embedding) in the desired output dimension.
Tessellation: Create the cells required for visualization using the Voronoi tessellation method; the package includes heatmap plots for Hierarchical Voronoi Tessellations (HVT). This step enables data insights, visualization, and interaction with the topology-preserving map, and is useful for semi-supervised tasks.
Scoring: Scoring data sets and recording their assignment using the map objects from the above steps, in a sequence of maps if required.
Temporal Analysis and Visualization: A collection of functions that extends the HVT package to time series data, analyzing underlying patterns, calculating transition probabilities, and visualizing the flow of data over time.
What’s New?
This update focuses on integrating time series capabilities into the HVT package by extending its foundational operations to time series data, which is the focus of this vignette.
The new functions analyze underlying patterns and trends within the data, provide insight into its evolution over time, and quantify its movement by calculating transition probabilities, producing elegant plots and GIFs.
Below are the new functions and their brief descriptions:
plotStateTransition: Provides the time series flowmap.
getTransitionProbability: Provides a list of transition probabilities.
reconcileTransitionProbability: Provides plots and tables for comparing transition probabilities calculated manually and with the markovchain function.
plotAnimatedFlowmap: Creates flowmaps and animations for both self-state and without-self-state scenarios.

The Lorenz attractor is a three-dimensional figure generated by a set of differential equations that model a simple chaotic dynamic system of convective flow. It arises from a simplified set of equations describing the behavior of a system involving three variables, which represent the state of the system at any given time and are typically denoted (x, y, z). The equations are as follows:
\[ dx/dt = σ(y - x) \] \[ dy/dt = x(r - z) - y \] \[ dz/dt = xy - βz \] where dx/dt, dy/dt, and dz/dt represent the rates of change of x, y, and z respectively over time (t). σ, r, and β are constant parameters of the system: σ (σ = 10) controls the rate of convection, r (r = 28) controls the difference in temperature between the convective and stable regions, and β (β = 8/3) represents the ratio of the width to the height of the convective layer. When these equations are integrated and plotted in three-dimensional space, they produce a chaotic trajectory that never repeats. The Lorenz attractor exhibits sensitive dependence on initial conditions: even small differences in the initial conditions can lead to drastically different trajectories over time. This sensitivity is a defining characteristic of chaotic systems.
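The dynamics above can be sketched with a simple forward-Euler integration in R. This is only an illustrative simulation (the function name, step size, and initial state are assumptions here), not necessarily how the bundled dataset was generated.

```r
# Minimal sketch: integrate the Lorenz equations with forward-Euler steps.
lorenz_simulate <- function(n_steps = 1000, dt = 0.001,
                            sigma = 10, r = 28, beta = 8 / 3,
                            init = c(x = 0, y = 1, z = 20)) {
  out <- matrix(NA_real_, nrow = n_steps, ncol = 3,
                dimnames = list(NULL, c("X", "Y", "Z")))
  s <- init
  for (i in seq_len(n_steps)) {
    dx <- sigma * (s[["y"]] - s[["x"]])          # dx/dt = σ(y - x)
    dy <- s[["x"]] * (r - s[["z"]]) - s[["y"]]   # dy/dt = x(r - z) - y
    dz <- s[["x"]] * s[["y"]] - beta * s[["z"]]  # dz/dt = xy - βz
    s <- s + dt * c(dx, dy, dz)                  # one Euler step
    out[i, ] <- s
  }
  as.data.frame(out)
}

head(lorenz_simulate(n_steps = 5), 3)
```

Smaller step sizes give a more accurate trajectory at the cost of more iterations; a proper ODE solver (e.g. from the deSolve package) would be the usual choice for production use.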
In this notebook, we will use the Lorenz Attractor Dataset. This dataset contains 200,000 (two hundred thousand) observations and 5 columns. The dataset can be downloaded from here.
The dataset includes the following columns:
This chunk checks that all the packages necessary to run this vignette are installed, installs any that are missing, and attaches all of them to the session environment.
list.of.packages <- c("dplyr", "kableExtra", "plotly", "purrr", "data.table", "gridExtra", "grid", "reactable", "reshape", "tidyr",
"stringr", "DT", "knitr", "feather")
new.packages <-
list.of.packages[!(list.of.packages %in% installed.packages()[, "Package"])]
if (length(new.packages))
install.packages(new.packages, dependencies = TRUE, verbose = FALSE, repos='https://cloud.r-project.org/')
invisible(lapply(list.of.packages, library, character.only = TRUE))

# Sourcing required code scripts for HVT
script_dir <- "../R"
r_files <- list.files(script_dir, pattern = "\\.R$", full.names = TRUE)
invisible(lapply(r_files, function(file) source(file, echo = FALSE)))

Here, we load the data. Let’s explore the Lorenz Attractor Dataset. For the sake of brevity, we are displaying only the first ten rows.
file_path <- "./sample_dataset/lorenz_attractor.feather"
dataset <- read_feather(file_path) %>% as.data.frame()
dataset <- dataset %>% select(X,Y,Z,U,t)
dataset$t <- round(dataset$t, 5)
displayTable(head(dataset, 10))

| X | Y | Z | U | t |
|---|---|---|---|---|
| 0.0000 | 1.0000 | 20.0000 | 0.0000 | 0.0000 |
| 0.0025 | 0.9998 | 19.9867 | 0.0005 | 0.0003 |
| 0.0050 | 0.9995 | 19.9734 | 0.0010 | 0.0005 |
| 0.0075 | 0.9993 | 19.9601 | 0.0015 | 0.0008 |
| 0.0099 | 0.9990 | 19.9468 | 0.0020 | 0.0010 |
| 0.0124 | 0.9988 | 19.9335 | 0.0025 | 0.0013 |
| 0.0149 | 0.9986 | 19.9202 | 0.0030 | 0.0015 |
| 0.0173 | 0.9984 | 19.9069 | 0.0035 | 0.0018 |
| 0.0198 | 0.9982 | 19.8937 | 0.0040 | 0.0020 |
| 0.0222 | 0.9980 | 19.8804 | 0.0045 | 0.0022 |
Now, let’s try to visualize the Lorenz attractor (overlapping spirals) in 3D space.
data_3d <- dataset[sample(1:nrow(dataset), 1000), ]
plot_ly(data_3d, x= ~X, y= ~Y, z = ~Z) %>% add_markers( marker = list(
size = 2,
symbol = "circle",
color = ~Z,
colorscale = "Bluered",
    colorbar = list(title = 'Z')))

Figure 1: Lorenz attractor in 3D space
Now, let’s have a look at the structure of the Lorenz Attractor dataset.
str(dataset)

## 'data.frame': 200000 obs. of 5 variables:
## $ X: num 0 0.0025 0.00499 0.00747 0.00995 ...
## $ Y: num 1 1 1 0.999 0.999 ...
## $ Z: num 20 20 20 20 19.9 ...
## $ U: num 0 0.0005 0.001 0.0015 0.002 ...
## $ t: num 0 0.00025 0.0005 0.00075 0.001 0.00125 0.0015 0.00175 0.002 0.00225 ...
Data Distribution

This section displays five objects.
Variable Histograms: The histogram distribution of all the features in the dataset.
Box Plots: Box plots for all the features in the dataset. These plots will display the median and Interquartile range of each column at a panel level.
Correlation Matrix: This calculates the Pearson correlation which is a bivariate correlation value measuring the linear correlation between two numeric columns. The output plot is shown as a matrix.
Summary EDA: The table provides descriptive statistics for all the features in the dataset.
Time Series Plots: Plots of all features (including time) against the time column.
It uses an inbuilt function called edaPlots to display the above-mentioned five objects.
NOTE: The input dataset should be a data frame, and all of its columns should be numeric.
edaPlots(dataset, time_column = "t", output_type = "timeseries", n_cols = 5)
edaPlots(dataset, output_type = 'summary', n_cols = 5)
edaPlots(dataset, output_type = 'histogram', n_cols = 5)
edaPlots(dataset, output_type = 'boxplot', n_cols = 5)
edaPlots(dataset, output_type = 'correlation', n_cols = 5)

Train - Test Split
Let us split the dataset into train and test sets. We will select the first 80% of the rows (in time order) as train and the remaining 20% as test.
noOfPoints <- dim(dataset)[1]
trainLength <- as.integer(noOfPoints * 0.8)
trainDataset <- dataset[1:trainLength,]
testDataset <- dataset[(trainLength+1):noOfPoints,]
rownames(testDataset) <- NULL

Let’s have a look at the training dataset containing 160,000 data points. For the sake of brevity, we are displaying the first 10 rows.
displayTable(head(trainDataset, 10))

| X | Y | Z | U | t |
|---|---|---|---|---|
| 0.0000 | 1.0000 | 20.0000 | 0.0000 | 0.0000 |
| 0.0025 | 0.9998 | 19.9867 | 0.0005 | 0.0003 |
| 0.0050 | 0.9995 | 19.9734 | 0.0010 | 0.0005 |
| 0.0075 | 0.9993 | 19.9601 | 0.0015 | 0.0008 |
| 0.0099 | 0.9990 | 19.9468 | 0.0020 | 0.0010 |
| 0.0124 | 0.9988 | 19.9335 | 0.0025 | 0.0013 |
| 0.0149 | 0.9986 | 19.9202 | 0.0030 | 0.0015 |
| 0.0173 | 0.9984 | 19.9069 | 0.0035 | 0.0018 |
| 0.0198 | 0.9982 | 19.8937 | 0.0040 | 0.0020 |
| 0.0222 | 0.9980 | 19.8804 | 0.0045 | 0.0022 |
Now, let’s have a look at the structure of the training dataset.
str(trainDataset)

## 'data.frame': 160000 obs. of 5 variables:
## $ X: num 0 0.0025 0.00499 0.00747 0.00995 ...
## $ Y: num 1 1 1 0.999 0.999 ...
## $ Z: num 20 20 20 20 19.9 ...
## $ U: num 0 0.0005 0.001 0.0015 0.002 ...
## $ t: num 0 0.00025 0.0005 0.00075 0.001 0.00125 0.0015 0.00175 0.002 0.00225 ...
Data Distribution
edaPlots(trainDataset, time_column = "t", output_type = "timeseries", n_cols = 5)
edaPlots(trainDataset, output_type = 'summary', n_cols = 5)
edaPlots(trainDataset, output_type = 'histogram', n_cols = 5)
edaPlots(trainDataset, output_type = 'boxplot', n_cols = 5)
edaPlots(trainDataset, output_type = 'correlation', n_cols = 5)

Let’s have a look at the testing dataset containing 40,000 data points. For the sake of brevity, we are displaying the first 10 rows.
displayTable(head(testDataset, 10))

| X | Y | Z | U | t |
|---|---|---|---|---|
| 16.0583 | 13.6588 | 39.5994 | 9.8935 | 40.0002 |
| 16.0523 | 13.6088 | 39.6278 | 9.8935 | 40.0004 |
| 16.0461 | 13.5587 | 39.6558 | 9.8934 | 40.0007 |
| 16.0399 | 13.5085 | 39.6837 | 9.8933 | 40.0010 |
| 16.0335 | 13.4582 | 39.7113 | 9.8932 | 40.0012 |
| 16.0270 | 13.4079 | 39.7386 | 9.8932 | 40.0014 |
| 16.0204 | 13.3575 | 39.7657 | 9.8931 | 40.0017 |
| 16.0137 | 13.3070 | 39.7926 | 9.8930 | 40.0020 |
| 16.0068 | 13.2564 | 39.8192 | 9.8929 | 40.0022 |
| 15.9999 | 13.2057 | 39.8456 | 9.8929 | 40.0025 |
Now, let’s have a look at the structure of the testing dataset.
str(testDataset)

## 'data.frame': 40000 obs. of 5 variables:
## $ X: num 16.1 16.1 16 16 16 ...
## $ Y: num 13.7 13.6 13.6 13.5 13.5 ...
## $ Z: num 39.6 39.6 39.7 39.7 39.7 ...
## $ U: num 9.89 9.89 9.89 9.89 9.89 ...
## $ t: num 40 40 40 40 40 ...
Data Distribution
edaPlots(testDataset, time_column = "t", output_type = "timeseries", n_cols = 5)
edaPlots(testDataset, output_type = 'summary', n_cols = 5)
edaPlots(testDataset, output_type = 'histogram', n_cols = 5)
edaPlots(testDataset, output_type = 'boxplot', n_cols = 5)
edaPlots(testDataset, output_type = 'correlation', n_cols = 5)

We will use the trainHVT function to compress our dataset while preserving its essential features.
Model Parameters
NOTE: Compression takes place only for the X, Y, and Z coordinates, not for U (velocity) and t (timestamp). After training and scoring, we merge the U and t columns back into the dataset.
hvt.results <- trainHVT(
trainDataset[,-c(4:5)],
n_cells = 100,
depth = 1,
quant.err = 0.1,
normalize = TRUE,
distance_metric = "L1_Norm",
error_metric = "max",
quant_method = "kmeans",
  dim_reduction_method = "sammon")

## Initial stress : 0.00301
## stress after 10 iters: 0.00155, magic = 0.500
## stress after 20 iters: 0.00154, magic = 0.500
Let’s check out the compression summary.
displayTable(data = hvt.results[[3]]$compression_summary)

| segmentLevel | noOfCells | noOfCellsBelowQuantizationError | percentOfCellsBelowQuantizationErrorThreshold | parameters |
|---|---|---|---|---|
| 1 | 100 | 0 | 0 | n_cells: 100 quant.err: 0.1 distance_metric: L1_Norm error_metric: max quant_method: kmeans |
NOTE: From the table above, it’s evident that the ‘percentOfCellsBelowQuantizationErrorThreshold’ value is zero, indicating that compression hasn’t been achieved for the specified number of cells (100). Typically, we would keep increasing the number of cells until at least 80% of cells compress below the threshold. However, we don’t do so in this vignette because the plots generated by the temporal analysis functions would become cluttered and complex, making the explanations less clear.
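The "keep increasing the number of cells" procedure mentioned above can be sketched as a simple search loop. Here `train_fn` and `tune_n_cells` are illustrative names, not part of the HVT package: in practice `train_fn` would wrap a call to trainHVT() and return the percentOfCellsBelowQuantizationErrorThreshold value from the compression summary.

```r
# Hedged sketch: double n_cells until at least target_pct percent of cells
# fall below the quantization-error threshold, or give up at n_max.
tune_n_cells <- function(train_fn, n_start = 100, target_pct = 80,
                         n_max = 10000) {
  n_cells <- n_start
  while (n_cells <= n_max) {
    pct <- train_fn(n_cells)          # train a model, read the summary %
    if (pct >= target_pct) {
      return(list(n_cells = n_cells, pct = pct))
    }
    n_cells <- n_cells * 2            # try a finer compression next round
  }
  stop("target compression not reached within n_max cells")
}

# Toy stand-in for train_fn: pretend compression improves with cell count.
tune_n_cells(function(n) min(100, n / 10))$n_cells
```

Doubling keeps the number of (potentially expensive) training runs logarithmic in the final cell count; a finer grid could then be searched around the returned value.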
Let’s check out the model summary from trainHVT().
hvt.results$model_info$input_parameters

## $input_dataset
## [1] "160000 Rows & 3 Columns"
##
## $n_cells
## [1] 100
##
## $depth
## [1] 1
##
## $quant.err
## [1] 0.1
##
## $normalize
## [1] TRUE
##
## $distance_metric
## [1] "L1_Norm"
##
## $error_metric
## [1] "max"
##
## $quant_method
## [1] "kmeans"
##
## $diagnose
## [1] FALSE
##
## $projection.scale
## [1] 10
##
## $hvt_validation
## [1] FALSE
##
## $train_validation_split_ratio
## [1] 0.8
Now, let’s plot the Voronoi tessellation for 100 cells.
plotHVT(
hvt.results,
centroid.size = c(0.6),
plot.type = '2Dhvt',
  cell_id = FALSE)

Figure 2: The Voronoi tessellation for layer 1 shown for the 100 cells in the dataset ‘Lorenz attractor’
To understand how cell IDs are distributed across the map, we again plot the Voronoi tessellation with cell_id = TRUE.
plotHVT(
hvt.results,
centroid.size = c(0.6),
plot.type = '2Dhvt',
  cell_id = TRUE)

Figure 3: The Voronoi tessellation for layer 1 shown for the 100 cells in the dataset ‘Lorenz attractor’ with Cell ID
Now that we have built the model, let us try scoring with it.
NOTE: We are using the training dataset plus the testing dataset, i.e., the raw dataset, here for demo purposes.
set.seed(240)
scoring <- scoreHVT(dataset,
hvt.results,
          child.level = 1)

Let’s see the output of scoreHVT(), which has scored the cells for all the data points in the dataset. For the sake of brevity, we will only show the first 100 rows.
In the displayTable function, the ‘columnName’ argument takes the name of a column whose values should be highlighted when greater than the given ‘value’ in the provided data.
displayTable(scoring$scoredPredictedData)

| Segment.Level | Segment.Parent | Segment.Child | n | Cell.ID | Quant.Error | centroidRadius | diff | anomalyFlag | X | Y | Z |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 89 | 1 | 55 | 0.0828 | 0.3469 | 0.2641 | 0 | -0.1084 | 0.0120 | -0.3780 |
| 1 | 1 | 89 | 1 | 55 | 0.0821 | 0.3469 | 0.2647 | 0 | -0.1081 | 0.0120 | -0.3795 |
| 1 | 1 | 89 | 1 | 55 | 0.0815 | 0.3469 | 0.2653 | 0 | -0.1078 | 0.0120 | -0.3810 |
| 1 | 1 | 89 | 1 | 55 | 0.0809 | 0.3469 | 0.2660 | 0 | -0.1075 | 0.0120 | -0.3826 |
| 1 | 1 | 89 | 1 | 55 | 0.0803 | 0.3469 | 0.2666 | 0 | -0.1072 | 0.0119 | -0.3841 |
| 1 | 1 | 89 | 1 | 55 | 0.0797 | 0.3469 | 0.2672 | 0 | -0.1068 | 0.0119 | -0.3856 |
| 1 | 1 | 89 | 1 | 55 | 0.079 | 0.3469 | 0.2678 | 0 | -0.1065 | 0.0119 | -0.3871 |
| 1 | 1 | 89 | 1 | 55 | 0.0784 | 0.3469 | 0.2684 | 0 | -0.1062 | 0.0119 | -0.3886 |
| 1 | 1 | 89 | 1 | 55 | 0.0778 | 0.3469 | 0.2691 | 0 | -0.1059 | 0.0118 | -0.3901 |
| 1 | 1 | 89 | 1 | 55 | 0.0772 | 0.3469 | 0.2697 | 0 | -0.1056 | 0.0118 | -0.3916 |
| 1 | 1 | 89 | 1 | 55 | 0.0766 | 0.3469 | 0.2703 | 0 | -0.1053 | 0.0118 | -0.3931 |
| 1 | 1 | 89 | 1 | 55 | 0.076 | 0.3469 | 0.2709 | 0 | -0.1050 | 0.0118 | -0.3947 |
| 1 | 1 | 89 | 1 | 55 | 0.0754 | 0.3469 | 0.2715 | 0 | -0.1047 | 0.0117 | -0.3962 |
| 1 | 1 | 89 | 1 | 55 | 0.0747 | 0.3469 | 0.2721 | 0 | -0.1044 | 0.0117 | -0.3977 |
| 1 | 1 | 89 | 1 | 55 | 0.0741 | 0.3469 | 0.2727 | 0 | -0.1040 | 0.0117 | -0.3992 |
| 1 | 1 | 89 | 1 | 55 | 0.0735 | 0.3469 | 0.2733 | 0 | -0.1037 | 0.0117 | -0.4007 |
| 1 | 1 | 89 | 1 | 55 | 0.0729 | 0.3469 | 0.2739 | 0 | -0.1034 | 0.0117 | -0.4022 |
| 1 | 1 | 89 | 1 | 55 | 0.0723 | 0.3469 | 0.2746 | 0 | -0.1031 | 0.0116 | -0.4037 |
| 1 | 1 | 89 | 1 | 55 | 0.0717 | 0.3469 | 0.2752 | 0 | -0.1028 | 0.0116 | -0.4052 |
| 1 | 1 | 89 | 1 | 55 | 0.0711 | 0.3469 | 0.2758 | 0 | -0.1025 | 0.0116 | -0.4067 |
| 1 | 1 | 89 | 1 | 55 | 0.0705 | 0.3469 | 0.2764 | 0 | -0.1022 | 0.0116 | -0.4082 |
| 1 | 1 | 89 | 1 | 55 | 0.0699 | 0.3469 | 0.2770 | 0 | -0.1019 | 0.0116 | -0.4097 |
| 1 | 1 | 89 | 1 | 55 | 0.0693 | 0.3469 | 0.2776 | 0 | -0.1016 | 0.0116 | -0.4112 |
| 1 | 1 | 89 | 1 | 55 | 0.0687 | 0.3469 | 0.2782 | 0 | -0.1013 | 0.0115 | -0.4127 |
| 1 | 1 | 89 | 1 | 55 | 0.0681 | 0.3469 | 0.2788 | 0 | -0.1010 | 0.0115 | -0.4142 |
| 1 | 1 | 89 | 1 | 55 | 0.0675 | 0.3469 | 0.2794 | 0 | -0.1007 | 0.0115 | -0.4156 |
| 1 | 1 | 89 | 1 | 55 | 0.0669 | 0.3469 | 0.2800 | 0 | -0.1004 | 0.0115 | -0.4171 |
| 1 | 1 | 89 | 1 | 55 | 0.0663 | 0.3469 | 0.2806 | 0 | -0.1001 | 0.0115 | -0.4186 |
| 1 | 1 | 89 | 1 | 55 | 0.0657 | 0.3469 | 0.2812 | 0 | -0.0998 | 0.0115 | -0.4201 |
| 1 | 1 | 89 | 1 | 55 | 0.0651 | 0.3469 | 0.2818 | 0 | -0.0995 | 0.0115 | -0.4216 |
| 1 | 1 | 89 | 1 | 55 | 0.0645 | 0.3469 | 0.2824 | 0 | -0.0992 | 0.0115 | -0.4231 |
| 1 | 1 | 89 | 1 | 55 | 0.0639 | 0.3469 | 0.2830 | 0 | -0.0989 | 0.0114 | -0.4246 |
| 1 | 1 | 89 | 1 | 55 | 0.0633 | 0.3469 | 0.2836 | 0 | -0.0987 | 0.0114 | -0.4261 |
| 1 | 1 | 89 | 1 | 55 | 0.0627 | 0.3469 | 0.2842 | 0 | -0.0984 | 0.0114 | -0.4275 |
| 1 | 1 | 89 | 1 | 55 | 0.0621 | 0.3469 | 0.2848 | 0 | -0.0981 | 0.0114 | -0.4290 |
| 1 | 1 | 89 | 1 | 55 | 0.0615 | 0.3469 | 0.2854 | 0 | -0.0978 | 0.0114 | -0.4305 |
| 1 | 1 | 89 | 1 | 55 | 0.0609 | 0.3469 | 0.2860 | 0 | -0.0975 | 0.0114 | -0.4320 |
| 1 | 1 | 89 | 1 | 55 | 0.0603 | 0.3469 | 0.2865 | 0 | -0.0972 | 0.0114 | -0.4335 |
| 1 | 1 | 89 | 1 | 55 | 0.0597 | 0.3469 | 0.2871 | 0 | -0.0969 | 0.0114 | -0.4350 |
| 1 | 1 | 89 | 1 | 55 | 0.0591 | 0.3469 | 0.2877 | 0 | -0.0966 | 0.0114 | -0.4364 |
| 1 | 1 | 89 | 1 | 55 | 0.0585 | 0.3469 | 0.2883 | 0 | -0.0963 | 0.0114 | -0.4379 |
| 1 | 1 | 89 | 1 | 55 | 0.058 | 0.3469 | 0.2889 | 0 | -0.0961 | 0.0114 | -0.4394 |
| 1 | 1 | 89 | 1 | 55 | 0.0574 | 0.3469 | 0.2895 | 0 | -0.0958 | 0.0114 | -0.4409 |
| 1 | 1 | 89 | 1 | 55 | 0.0568 | 0.3469 | 0.2901 | 0 | -0.0955 | 0.0114 | -0.4423 |
| 1 | 1 | 89 | 1 | 55 | 0.0562 | 0.3469 | 0.2907 | 0 | -0.0952 | 0.0114 | -0.4438 |
| 1 | 1 | 89 | 1 | 55 | 0.0556 | 0.3469 | 0.2913 | 0 | -0.0949 | 0.0114 | -0.4453 |
| 1 | 1 | 89 | 1 | 55 | 0.055 | 0.3469 | 0.2918 | 0 | -0.0946 | 0.0114 | -0.4467 |
| 1 | 1 | 89 | 1 | 55 | 0.0544 | 0.3469 | 0.2924 | 0 | -0.0944 | 0.0114 | -0.4482 |
| 1 | 1 | 89 | 1 | 55 | 0.0539 | 0.3469 | 0.2930 | 0 | -0.0941 | 0.0114 | -0.4497 |
| 1 | 1 | 89 | 1 | 55 | 0.0533 | 0.3469 | 0.2936 | 0 | -0.0938 | 0.0114 | -0.4512 |
| 1 | 1 | 89 | 1 | 55 | 0.0527 | 0.3469 | 0.2942 | 0 | -0.0935 | 0.0114 | -0.4526 |
| 1 | 1 | 89 | 1 | 55 | 0.0521 | 0.3469 | 0.2948 | 0 | -0.0932 | 0.0114 | -0.4541 |
| 1 | 1 | 89 | 1 | 55 | 0.0515 | 0.3469 | 0.2953 | 0 | -0.0930 | 0.0114 | -0.4556 |
| 1 | 1 | 89 | 1 | 55 | 0.0509 | 0.3469 | 0.2959 | 0 | -0.0927 | 0.0114 | -0.4570 |
| 1 | 1 | 89 | 1 | 55 | 0.0504 | 0.3469 | 0.2965 | 0 | -0.0924 | 0.0114 | -0.4585 |
| 1 | 1 | 89 | 1 | 55 | 0.0498 | 0.3469 | 0.2971 | 0 | -0.0921 | 0.0114 | -0.4599 |
| 1 | 1 | 89 | 1 | 55 | 0.0492 | 0.3469 | 0.2976 | 0 | -0.0919 | 0.0114 | -0.4614 |
| 1 | 1 | 89 | 1 | 55 | 0.0486 | 0.3469 | 0.2982 | 0 | -0.0916 | 0.0114 | -0.4629 |
| 1 | 1 | 89 | 1 | 55 | 0.0481 | 0.3469 | 0.2988 | 0 | -0.0913 | 0.0114 | -0.4643 |
| 1 | 1 | 89 | 1 | 55 | 0.0475 | 0.3469 | 0.2994 | 0 | -0.0910 | 0.0114 | -0.4658 |
| 1 | 1 | 89 | 1 | 55 | 0.0469 | 0.3469 | 0.2999 | 0 | -0.0908 | 0.0114 | -0.4672 |
| 1 | 1 | 89 | 1 | 55 | 0.0463 | 0.3469 | 0.3005 | 0 | -0.0905 | 0.0114 | -0.4687 |
| 1 | 1 | 89 | 1 | 55 | 0.0458 | 0.3469 | 0.3011 | 0 | -0.0902 | 0.0114 | -0.4701 |
| 1 | 1 | 89 | 1 | 55 | 0.0452 | 0.3469 | 0.3017 | 0 | -0.0899 | 0.0114 | -0.4716 |
| 1 | 1 | 89 | 1 | 55 | 0.0446 | 0.3469 | 0.3022 | 0 | -0.0897 | 0.0114 | -0.4730 |
| 1 | 1 | 89 | 1 | 55 | 0.044 | 0.3469 | 0.3028 | 0 | -0.0894 | 0.0114 | -0.4745 |
| 1 | 1 | 89 | 1 | 55 | 0.0435 | 0.3469 | 0.3034 | 0 | -0.0891 | 0.0114 | -0.4759 |
| 1 | 1 | 89 | 1 | 55 | 0.0429 | 0.3469 | 0.3039 | 0 | -0.0889 | 0.0114 | -0.4774 |
| 1 | 1 | 89 | 1 | 55 | 0.0423 | 0.3469 | 0.3045 | 0 | -0.0886 | 0.0115 | -0.4788 |
| 1 | 1 | 89 | 1 | 55 | 0.0418 | 0.3469 | 0.3051 | 0 | -0.0883 | 0.0115 | -0.4803 |
| 1 | 1 | 89 | 1 | 55 | 0.0412 | 0.3469 | 0.3056 | 0 | -0.0881 | 0.0115 | -0.4817 |
| 1 | 1 | 89 | 1 | 55 | 0.0406 | 0.3469 | 0.3062 | 0 | -0.0878 | 0.0115 | -0.4832 |
| 1 | 1 | 89 | 1 | 55 | 0.0401 | 0.3469 | 0.3068 | 0 | -0.0875 | 0.0115 | -0.4846 |
| 1 | 1 | 89 | 1 | 55 | 0.0395 | 0.3469 | 0.3073 | 0 | -0.0873 | 0.0115 | -0.4861 |
| 1 | 1 | 89 | 1 | 55 | 0.0389 | 0.3469 | 0.3079 | 0 | -0.0870 | 0.0115 | -0.4875 |
| 1 | 1 | 89 | 1 | 55 | 0.0384 | 0.3469 | 0.3085 | 0 | -0.0867 | 0.0115 | -0.4889 |
| 1 | 1 | 89 | 1 | 55 | 0.0378 | 0.3469 | 0.3090 | 0 | -0.0865 | 0.0116 | -0.4904 |
| 1 | 1 | 89 | 1 | 55 | 0.0373 | 0.3469 | 0.3096 | 0 | -0.0862 | 0.0116 | -0.4918 |
| 1 | 1 | 89 | 1 | 55 | 0.0367 | 0.3469 | 0.3102 | 0 | -0.0860 | 0.0116 | -0.4933 |
| 1 | 1 | 89 | 1 | 55 | 0.0361 | 0.3469 | 0.3107 | 0 | -0.0857 | 0.0116 | -0.4947 |
| 1 | 1 | 89 | 1 | 55 | 0.0356 | 0.3469 | 0.3113 | 0 | -0.0854 | 0.0116 | -0.4961 |
| 1 | 1 | 89 | 1 | 55 | 0.035 | 0.3469 | 0.3118 | 0 | -0.0852 | 0.0116 | -0.4976 |
| 1 | 1 | 89 | 1 | 55 | 0.0345 | 0.3469 | 0.3124 | 0 | -0.0849 | 0.0117 | -0.4990 |
| 1 | 1 | 89 | 1 | 55 | 0.0339 | 0.3469 | 0.3130 | 0 | -0.0847 | 0.0117 | -0.5004 |
| 1 | 1 | 89 | 1 | 55 | 0.0333 | 0.3469 | 0.3135 | 0 | -0.0844 | 0.0117 | -0.5019 |
| 1 | 1 | 89 | 1 | 55 | 0.0328 | 0.3469 | 0.3141 | 0 | -0.0841 | 0.0117 | -0.5033 |
| 1 | 1 | 89 | 1 | 55 | 0.0322 | 0.3469 | 0.3146 | 0 | -0.0839 | 0.0117 | -0.5047 |
| 1 | 1 | 89 | 1 | 55 | 0.0317 | 0.3469 | 0.3152 | 0 | -0.0836 | 0.0118 | -0.5062 |
| 1 | 1 | 89 | 1 | 55 | 0.0311 | 0.3469 | 0.3157 | 0 | -0.0834 | 0.0118 | -0.5076 |
| 1 | 1 | 89 | 1 | 55 | 0.0306 | 0.3469 | 0.3163 | 0 | -0.0831 | 0.0118 | -0.5090 |
| 1 | 1 | 89 | 1 | 55 | 0.03 | 0.3469 | 0.3168 | 0 | -0.0829 | 0.0118 | -0.5104 |
| 1 | 1 | 89 | 1 | 55 | 0.0295 | 0.3469 | 0.3174 | 0 | -0.0826 | 0.0119 | -0.5119 |
| 1 | 1 | 89 | 1 | 55 | 0.0289 | 0.3469 | 0.3179 | 0 | -0.0824 | 0.0119 | -0.5133 |
| 1 | 1 | 89 | 1 | 55 | 0.0284 | 0.3469 | 0.3185 | 0 | -0.0821 | 0.0119 | -0.5147 |
| 1 | 1 | 89 | 1 | 55 | 0.0278 | 0.3469 | 0.3190 | 0 | -0.0819 | 0.0119 | -0.5161 |
| 1 | 1 | 89 | 1 | 55 | 0.0273 | 0.3469 | 0.3196 | 0 | -0.0816 | 0.0120 | -0.5175 |
| 1 | 1 | 89 | 1 | 55 | 0.0267 | 0.3469 | 0.3201 | 0 | -0.0814 | 0.0120 | -0.5190 |
| 1 | 1 | 89 | 1 | 55 | 0.0262 | 0.3469 | 0.3207 | 0 | -0.0811 | 0.0120 | -0.5204 |
| 1 | 1 | 89 | 1 | 55 | 0.0256 | 0.3469 | 0.3212 | 0 | -0.0809 | 0.0120 | -0.5218 |
| 1 | 1 | 89 | 1 | 55 | 0.0251 | 0.3469 | 0.3218 | 0 | -0.0806 | 0.0121 | -0.5232 |
Let’s look at the scored model summary from scoreHVT(). The model info displays five attributes, shown below:

scoring$model_info$scored_model_summary

## $input_dataset
## [1] "200000 Rows & 5 Columns"
##
## $scored_qe_range
## [1] "4e-04 to 0.3021"
##
## $mad.threshold
## [1] 0.2
##
## $no_of_anomaly_datapoints
## [1] 1353
##
## $no_of_anomaly_cells
## [1] 17
Now, let’s merge back the U and t columns from the dataset into the scoring output and prepare the data for the temporal analysis functions.

temporal_data <- cbind(scoring$scoredPredictedData, dataset[,c(4,5)]) %>% select(Cell.ID, t)

Let’s comprehend the plotStateTransition function, which is used to create a time series plotly object.
is used to create a time series plotly object.
plotStateTransition(
df,
sample_size,
line_plot,
cellid_column,
  time_column)

df - A data frame containing Cell IDs and timestamps.

sample_size - A numeric value specifying the sampling fraction, ranging between 0.1 and 1. The highest value, 1, outputs a plot with the entire dataset. Sampling of the data takes place from the last row to the first.

line_plot - A logical value. If TRUE, the output is a time series plot with a line connecting the states according to the sample_size. If FALSE, the output is the same time series plot without the line.

cellid_column - A character string specifying the column name of the Cell ID.

time_column - A character string specifying the column name of the time stamp.
plotStateTransition(df = temporal_data,
cellid_column = "Cell.ID",
time_column = "t",
                    sample_size = 0.2)

To demonstrate the ‘sample_size’ argument, we replicate the plot above with the entire dataset.
plotStateTransition(df = temporal_data,
cellid_column = "Cell.ID",
time_column = "t",
                    sample_size = 1)

Next, let’s look at the getTransitionProbability function and its arguments:

getTransitionProbability(
df,
cellid_column,
  time_column)

df - A data frame containing Cell IDs and timestamps.

cellid_column - A character string specifying the column name of the Cell ID.

time_column - A character string specifying the column name of the time stamp.
trans_table <- getTransitionProbability(df = temporal_data,
cellid_column = "Cell.ID",
                    time_column = "t")

NOTE: The output is stored as a nested list. For demo purposes, we display it here as a data frame with the first 10 rows.
combined_df <- do.call(rbind, trans_table)
displayTable(head(combined_df, 10))

| Current_State | Next_State | Relative_Frequency | Transition_Probability |
|---|---|---|---|
| 1 | 1 | 1161 | 0.9856 |
| 1 | 2 | 15 | 0.0127 |
| 1 | 7 | 2 | 0.0017 |
| 2 | 2 | 1382 | 0.9864 |
| 2 | 5 | 16 | 0.0114 |
| 2 | 10 | 3 | 0.0021 |
| 3 | 1 | 16 | 0.0097 |
| 3 | 3 | 1618 | 0.9842 |
| 3 | 7 | 10 | 0.0061 |
| 4 | 3 | 22 | 0.0131 |
Current_State: The cell (out of the 100 cells given in model training) in which the datapoint resides at a given time (t).
Next_State: The cell (out of the 100 cells given in model training) to which the datapoint moves at the next time unit (t+1).
Relative_Frequency: The number of times the datapoint moves from that Current_State to that Next_State.
Transition_Probability: The probability calculated from the Relative_Frequency: each Relative_Frequency divided by the total Relative_Frequency for that Current_State.
Next, we reconcile the transition probabilities of the current states to the next states, calculated both manually and with the markovchain function, with and without self-states.
reconcileTransitionProbability(
df,
hmap_type = "All",
cellid_column,
  time_column)

df - A data frame containing Cell IDs and timestamps.

hmap_type - A character string. If set to ‘without_self_state’, reconciliation plots comparing the manual and markovchain calculations of the highest transition probability excluding the self-state are given as output. If set to ‘self_state’, the corresponding plots including the self-state are given as output. If set to ‘All’, plots both including and excluding the self-state are given as output.

cellid_column - A character string specifying the column name of the Cell ID.

time_column - A character string specifying the column name of the time stamp.
reconcile_plots <- reconcileTransitionProbability(df = temporal_data,
hmap_type = "All",
cellid_column = "Cell.ID",
                    time_column = "t")

Reconciliation plots of transition probability with self-state
The transition probability of a state staying in the same state is calculated both manually and with the markovchain function, and the two are plotted for comparison. The darker diagonal cells indicate higher probabilities of cells staying in the same state.
reconcile_plots[[1]]

Reconciliation table of transition probability with self-state
displayTable(reconcile_plots[[2]], limit = 217)

| Current_State | Next_State_manual | Next_State_markov | Probability_manual_calculation | Probability_markov_function | diff |
|---|---|---|---|---|---|
| 1 | 2 | 2 | 0.0127 | 0.0127 | 0 |
| 1 | 7 | 7 | 0.0017 | 0.0017 | 0 |
| 2 | 5 | 5 | 0.0114 | 0.0114 | 0 |
| 2 | 10 | 10 | 0.0021 | 0.0021 | 0 |
| 3 | 1 | 1 | 0.0097 | 0.0097 | 0 |
| 3 | 7 | 7 | 0.0061 | 0.0061 | 0 |
| 4 | 3 | 3 | 0.0131 | 0.0131 | 0 |
| 4 | 9 | 9 | 0.0024 | 0.0024 | 0 |
| 5 | 10 | 10 | 0.0008 | 0.0008 | 0 |
| 5 | 11 | 11 | 0.0116 | 0.0116 | 0 |
| 6 | 4 | 4 | 0.0144 | 0.0144 | 0 |
| 6 | 14 | 14 | 0.0006 | 0.0006 | 0 |
| 7 | 1 | 1 | 0.0006 | 0.0006 | 0 |
| 7 | 2 | 2 | 0.0023 | 0.0023 | 0 |
| 7 | 10 | 10 | 0.0053 | 0.0053 | 0 |
| 7 | 15 | 15 | 0.0012 | 0.0012 | 0 |
| 8 | 6 | 6 | 0.0148 | 0.0148 | 0 |
| 8 | 13 | 13 | 0.0009 | 0.0009 | 0 |
| 9 | 3 | 3 | 0.0026 | 0.0026 | 0 |
| 9 | 7 | 7 | 0.0026 | 0.0026 | 0 |
| 9 | 15 | 15 | 0.0039 | 0.0039 | 0 |
| 10 | 11 | 11 | 0.0021 | 0.0021 | 0 |
| 10 | 19 | 19 | 0.0053 | 0.0053 | 0 |
| 10 | 20 | 20 | 0.0005 | 0.0005 | 0 |
| 11 | 19 | 19 | 0.0032 | 0.0032 | 0 |
| 11 | 21 | 21 | 0.0089 | 0.0089 | 0 |
| 12 | 8 | 8 | 0.0100 | 0.0100 | 0 |
| 12 | 13 | 13 | 0.0036 | 0.0036 | 0 |
| 13 | 6 | 6 | 0.0049 | 0.0049 | 0 |
| 13 | 8 | 8 | 0.0025 | 0.0025 | 0 |
| 13 | 14 | 14 | 0.0031 | 0.0031 | 0 |
| 13 | 18 | 18 | 0.0006 | 0.0006 | 0 |
| 14 | 4 | 4 | 0.0013 | 0.0013 | 0 |
| 14 | 9 | 9 | 0.0065 | 0.0065 | 0 |
| 15 | 10 | 10 | 0.0017 | 0.0017 | 0 |
| 15 | 20 | 20 | 0.0050 | 0.0050 | 0 |
| 16 | 12 | 12 | 0.0102 | 0.0102 | 0 |
| 16 | 17 | 17 | 0.0007 | 0.0007 | 0 |
| 17 | 12 | 12 | 0.0023 | 0.0023 | 0 |
| 17 | 13 | 13 | 0.0040 | 0.0040 | 0 |
| 17 | 18 | 18 | 0.0023 | 0.0023 | 0 |
| 18 | 13 | 13 | 0.0038 | 0.0038 | 0 |
| 18 | 14 | 14 | 0.0046 | 0.0046 | 0 |
| 19 | 21 | 21 | 0.0022 | 0.0022 | 0 |
| 19 | 26 | 26 | 0.0033 | 0.0033 | 0 |
| 19 | 27 | 27 | 0.0033 | 0.0033 | 0 |
| 20 | 19 | 19 | 0.0008 | 0.0008 | 0 |
| 20 | 26 | 26 | 0.0051 | 0.0051 | 0 |
| 21 | 27 | 27 | 0.0124 | 0.0124 | 0 |
| 22 | 16 | 16 | 0.0102 | 0.0102 | 0 |
| 22 | 23 | 23 | 0.0006 | 0.0006 | 0 |
| 23 | 17 | 17 | 0.0073 | 0.0073 | 0 |
| 23 | 22 | 22 | 0.0005 | 0.0005 | 0 |
| 24 | 18 | 18 | 0.0043 | 0.0043 | 0 |
| 24 | 23 | 23 | 0.0007 | 0.0007 | 0 |
| 25 | 22 | 22 | 0.0101 | 0.0101 | 0 |
| 25 | 28 | 28 | 0.0007 | 0.0007 | 0 |
| 26 | 27 | 27 | 0.0006 | 0.0006 | 0 |
| 26 | 32 | 32 | 0.0056 | 0.0056 | 0 |
| 26 | 33 | 33 | 0.0011 | 0.0011 | 0 |
| 27 | 26 | 26 | 0.0004 | 0.0004 | 0 |
| 27 | 33 | 33 | 0.0106 | 0.0106 | 0 |
| 28 | 22 | 22 | 0.0005 | 0.0005 | 0 |
| 28 | 23 | 23 | 0.0047 | 0.0047 | 0 |
| 28 | 25 | 25 | 0.0028 | 0.0028 | 0 |
| 29 | 23 | 23 | 0.0018 | 0.0018 | 0 |
| 29 | 24 | 24 | 0.0043 | 0.0043 | 0 |
| 30 | 25 | 25 | 0.0052 | 0.0052 | 0 |
| 30 | 28 | 28 | 0.0042 | 0.0042 | 0 |
| 31 | 29 | 29 | 0.0034 | 0.0034 | 0 |
| 32 | 37 | 37 | 0.0027 | 0.0027 | 0 |
| 32 | 40 | 40 | 0.0048 | 0.0048 | 0 |
| 33 | 32 | 32 | 0.0016 | 0.0016 | 0 |
| 33 | 40 | 40 | 0.0012 | 0.0012 | 0 |
| 33 | 44 | 44 | 0.0077 | 0.0077 | 0 |
| 34 | 28 | 28 | 0.0032 | 0.0032 | 0 |
| 34 | 30 | 30 | 0.0032 | 0.0032 | 0 |
| 35 | 30 | 30 | 0.0050 | 0.0050 | 0 |
| 35 | 34 | 34 | 0.0025 | 0.0025 | 0 |
| 36 | 29 | 29 | 0.0031 | 0.0031 | 0 |
| 36 | 31 | 31 | 0.0006 | 0.0006 | 0 |
| 36 | 34 | 34 | 0.0013 | 0.0013 | 0 |
| 37 | 39 | 39 | 0.0035 | 0.0035 | 0 |
| 37 | 48 | 48 | 0.0017 | 0.0017 | 0 |
| 38 | 31 | 31 | 0.0011 | 0.0011 | 0 |
| 38 | 36 | 36 | 0.0033 | 0.0033 | 0 |
| 39 | 31 | 31 | 0.0010 | 0.0010 | 0 |
| 39 | 38 | 38 | 0.0019 | 0.0019 | 0 |
| 39 | 46 | 46 | 0.0014 | 0.0014 | 0 |
| 40 | 37 | 37 | 0.0018 | 0.0018 | 0 |
| 40 | 48 | 48 | 0.0036 | 0.0036 | 0 |
| 40 | 56 | 56 | 0.0036 | 0.0036 | 0 |
| 41 | 34 | 34 | 0.0020 | 0.0020 | 0 |
| 41 | 35 | 35 | 0.0024 | 0.0024 | 0 |
| 42 | 35 | 35 | 0.0050 | 0.0050 | 0 |
| 43 | 34 | 34 | 0.0014 | 0.0014 | 0 |
| 43 | 36 | 36 | 0.0007 | 0.0007 | 0 |
| 43 | 41 | 41 | 0.0021 | 0.0021 | 0 |
| 44 | 40 | 40 | 0.0042 | 0.0042 | 0 |
| 44 | 56 | 56 | 0.0058 | 0.0058 | 0 |
| 45 | 39 | 39 | 0.0024 | 0.0024 | 0 |
| 45 | 53 | 53 | 0.0024 | 0.0024 | 0 |
| 46 | 38 | 38 | 0.0019 | 0.0019 | 0 |
| 46 | 47 | 47 | 0.0033 | 0.0033 | 0 |
| 47 | 43 | 43 | 0.0034 | 0.0034 | 0 |
| 47 | 52 | 52 | 0.0010 | 0.0010 | 0 |
| 48 | 50 | 50 | 0.0059 | 0.0059 | 0 |
| 48 | 58 | 58 | 0.0017 | 0.0017 | 0 |
| 49 | 42 | 42 | 0.0040 | 0.0040 | 0 |
| 50 | 46 | 46 | 0.0015 | 0.0015 | 0 |
| 50 | 54 | 54 | 0.0037 | 0.0037 | 0 |
| 50 | 62 | 62 | 0.0011 | 0.0011 | 0 |
| 51 | 41 | 41 | 0.0016 | 0.0016 | 0 |
| 51 | 49 | 49 | 0.0030 | 0.0030 | 0 |
| 52 | 43 | 43 | 0.0006 | 0.0006 | 0 |
| 52 | 51 | 51 | 0.0042 | 0.0042 | 0 |
| 52 | 63 | 63 | 0.0009 | 0.0009 | 0 |
| 53 | 39 | 39 | 0.0003 | 0.0003 | 0 |
| 53 | 46 | 46 | 0.0013 | 0.0013 | 0 |
| 53 | 54 | 54 | 0.0036 | 0.0036 | 0 |
| 54 | 47 | 47 | 0.0008 | 0.0008 | 0 |
| 54 | 55 | 55 | 0.0045 | 0.0045 | 0 |
| 54 | 60 | 60 | 0.0011 | 0.0011 | 0 |
| 55 | 47 | 47 | 0.0009 | 0.0009 | 0 |
| 55 | 52 | 52 | 0.0040 | 0.0040 | 0 |
| 55 | 64 | 64 | 0.0009 | 0.0009 | 0 |
| 56 | 48 | 48 | 0.0031 | 0.0031 | 0 |
| 56 | 58 | 58 | 0.0054 | 0.0054 | 0 |
| 57 | 45 | 45 | 0.0008 | 0.0008 | 0 |
| 57 | 53 | 53 | 0.0056 | 0.0056 | 0 |
| 58 | 50 | 50 | 0.0012 | 0.0012 | 0 |
| 58 | 62 | 62 | 0.0050 | 0.0050 | 0 |
| 59 | 45 | 45 | 0.0009 | 0.0009 | 0 |
| 59 | 57 | 57 | 0.0066 | 0.0066 | 0 |
| 59 | 65 | 65 | 0.0009 | 0.0009 | 0 |
| 60 | 55 | 55 | 0.0005 | 0.0005 | 0 |
| 60 | 64 | 64 | 0.0043 | 0.0043 | 0 |
| 61 | 60 | 60 | 0.0036 | 0.0036 | 0 |
| 61 | 66 | 66 | 0.0008 | 0.0008 | 0 |
| 62 | 54 | 54 | 0.0007 | 0.0007 | 0 |
| 62 | 60 | 60 | 0.0017 | 0.0017 | 0 |
| 62 | 66 | 66 | 0.0030 | 0.0030 | 0 |
| 63 | 67 | 67 | 0.0011 | 0.0011 | 0 |
| 63 | 68 | 68 | 0.0004 | 0.0004 | 0 |
| 63 | 72 | 72 | 0.0008 | 0.0008 | 0 |
| 64 | 52 | 52 | 0.0007 | 0.0007 | 0 |
| 64 | 63 | 63 | 0.0004 | 0.0004 | 0 |
| 64 | 68 | 68 | 0.0031 | 0.0031 | 0 |
| 65 | 57 | 57 | 0.0009 | 0.0009 | 0 |
| 65 | 61 | 61 | 0.0051 | 0.0051 | 0 |
| 66 | 71 | 71 | 0.0040 | 0.0040 | 0 |
| 67 | 72 | 72 | 0.0006 | 0.0006 | 0 |
| 67 | 76 | 76 | 0.0011 | 0.0011 | 0 |
| 68 | 63 | 63 | 0.0003 | 0.0003 | 0 |
| 68 | 72 | 72 | 0.0026 | 0.0026 | 0 |
| 68 | 73 | 73 | 0.0010 | 0.0010 | 0 |
| 69 | 59 | 59 | 0.0053 | 0.0053 | 0 |
| 69 | 70 | 70 | 0.0043 | 0.0043 | 0 |
| 70 | 59 | 59 | 0.0036 | 0.0036 | 0 |
| 70 | 65 | 65 | 0.0050 | 0.0050 | 0 |
| 71 | 74 | 74 | 0.0039 | 0.0039 | 0 |
| 72 | 73 | 73 | 0.0033 | 0.0033 | 0 |
| 72 | 76 | 76 | 0.0008 | 0.0008 | 0 |
| 72 | 77 | 77 | 0.0012 | 0.0012 | 0 |
| 73 | 77 | 77 | 0.0033 | 0.0033 | 0 |
| 73 | 80 | 80 | 0.0016 | 0.0016 | 0 |
| 74 | 79 | 79 | 0.0041 | 0.0041 | 0 |
| 75 | 69 | 69 | 0.0055 | 0.0055 | 0 |
| 75 | 70 | 70 | 0.0055 | 0.0055 | 0 |
| 76 | 77 | 77 | 0.0029 | 0.0029 | 0 |
| 76 | 82 | 82 | 0.0010 | 0.0010 | 0 |
| 77 | 80 | 80 | 0.0050 | 0.0050 | 0 |
| 77 | 82 | 82 | 0.0020 | 0.0020 | 0 |
| 78 | 69 | 69 | 0.0040 | 0.0040 | 0 |
| 78 | 75 | 75 | 0.0063 | 0.0063 | 0 |
| 79 | 83 | 83 | 0.0050 | 0.0050 | 0 |
| 80 | 84 | 84 | 0.0057 | 0.0057 | 0 |
| 81 | 75 | 75 | 0.0061 | 0.0061 | 0 |
| 81 | 78 | 78 | 0.0056 | 0.0056 | 0 |
| 82 | 84 | 84 | 0.0047 | 0.0047 | 0 |
| 83 | 86 | 86 | 0.0058 | 0.0058 | 0 |
| 83 | 88 | 88 | 0.0016 | 0.0016 | 0 |
| 84 | 83 | 83 | 0.0014 | 0.0014 | 0 |
| 84 | 88 | 88 | 0.0072 | 0.0072 | 0 |
| 85 | 78 | 78 | 0.0047 | 0.0047 | 0 |
| 85 | 81 | 81 | 0.0058 | 0.0058 | 0 |
| 86 | 89 | 89 | 0.0062 | 0.0062 | 0 |
| 86 | 90 | 90 | 0.0040 | 0.0040 | 0 |
| 87 | 81 | 81 | 0.0067 | 0.0067 | 0 |
| 87 | 85 | 85 | 0.0037 | 0.0037 | 0 |
| 88 | 86 | 86 | 0.0037 | 0.0037 | 0 |
| 88 | 90 | 90 | 0.0059 | 0.0059 | 0 |
| 89 | 93 | 93 | 0.0068 | 0.0068 | 0 |
| 89 | 96 | 96 | 0.0049 | 0.0049 | 0 |
| 90 | 89 | 89 | 0.0049 | 0.0049 | 0 |
| 90 | 96 | 96 | 0.0061 | 0.0061 | 0 |
| 91 | 85 | 85 | 0.0074 | 0.0074 | 0 |
| 91 | 87 | 87 | 0.0037 | 0.0037 | 0 |
| 92 | 87 | 87 | 0.0066 | 0.0066 | 0 |
| 92 | 91 | 91 | 0.0024 | 0.0024 | 0 |
| 92 | 97 | 97 | 0.0006 | 0.0006 | 0 |
| 93 | 95 | 95 | 0.0069 | 0.0069 | 0 |
| 93 | 98 | 98 | 0.0063 | 0.0063 | 0 |
| 94 | 92 | 92 | 0.0074 | 0.0074 | 0 |
| 94 | 99 | 99 | 0.0027 | 0.0027 | 0 |
| 95 | 94 | 94 | 0.0074 | 0.0074 | 0 |
| 95 | 100 | 100 | 0.0054 | 0.0054 | 0 |
| 96 | 93 | 93 | 0.0069 | 0.0069 | 0 |
| 96 | 98 | 98 | 0.0056 | 0.0056 | 0 |
| 97 | 91 | 91 | 0.0112 | 0.0112 | 0 |
| 97 | 92 | 92 | 0.0016 | 0.0016 | 0 |
| 98 | 95 | 95 | 0.0057 | 0.0057 | 0 |
| 98 | 100 | 100 | 0.0071 | 0.0071 | 0 |
| 99 | 92 | 92 | 0.0019 | 0.0019 | 0 |
| 99 | 97 | 97 | 0.0094 | 0.0094 | 0 |
| 100 | 94 | 94 | 0.0027 | 0.0027 | 0 |
| 100 | 99 | 99 | 0.0094 | 0.0094 | 0 |
Reconciliation plots of transition probability without self-state
The transition probability of moving from one state to the next is calculated both manually and with the markovchain function, and the two results are plotted against each other for comparison. Of all the next-state transitions from a cell, the one with the highest probability is selected.
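The manual side of this comparison can be sketched in a few lines of base R. Below is a minimal, self-contained example (the `states` vector is hypothetical, not the vignette's data) showing how empirical transition probabilities are obtained from a sequence of cell IDs and how the highest-probability next state, excluding self-state, is picked:

```r
# Hypothetical sequence of cell IDs over time (not the vignette's data)
states <- c(1, 2, 2, 1, 3, 2, 1, 1, 2)

from <- head(states, -1)
to   <- tail(states, -1)

# Row-normalised contingency table: P(next = j | current = i)
trans_prob <- prop.table(table(from, to), margin = 1)

# Drop self-transitions, then pick the most likely next state per row
no_self <- trans_prob
diag(no_self) <- 0
best_next <- colnames(no_self)[apply(no_self, 1, which.max)]
```

The maximum-likelihood fit from the markovchain package is the same row-normalised count matrix, which is why the `diff` column in the reconciliation tables is zero throughout.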
reconcile_plots[[3]]
Reconciliation table of transition probability without self-state
displayTable(reconcile_plots[[4]], limit = 217)
| Current_State | Next_State_manual | Next_State_markov | Probability_manual_calculation | Probability_markov_function | diff |
|---|---|---|---|---|---|
| 1 | 2 | 2 | 0.0127 | 0.0127 | 0 |
| 1 | 7 | 7 | 0.0017 | 0.0017 | 0 |
| 2 | 5 | 5 | 0.0114 | 0.0114 | 0 |
| 2 | 10 | 10 | 0.0021 | 0.0021 | 0 |
| 3 | 1 | 1 | 0.0097 | 0.0097 | 0 |
| 3 | 7 | 7 | 0.0061 | 0.0061 | 0 |
| 4 | 3 | 3 | 0.0131 | 0.0131 | 0 |
| 4 | 9 | 9 | 0.0024 | 0.0024 | 0 |
| 5 | 10 | 10 | 0.0008 | 0.0008 | 0 |
| 5 | 11 | 11 | 0.0116 | 0.0116 | 0 |
| 6 | 4 | 4 | 0.0144 | 0.0144 | 0 |
| 6 | 14 | 14 | 0.0006 | 0.0006 | 0 |
| 7 | 1 | 1 | 0.0006 | 0.0006 | 0 |
| 7 | 2 | 2 | 0.0023 | 0.0023 | 0 |
| 7 | 10 | 10 | 0.0053 | 0.0053 | 0 |
| 7 | 15 | 15 | 0.0012 | 0.0012 | 0 |
| 8 | 6 | 6 | 0.0148 | 0.0148 | 0 |
| 8 | 13 | 13 | 0.0009 | 0.0009 | 0 |
| 9 | 3 | 3 | 0.0026 | 0.0026 | 0 |
| 9 | 7 | 7 | 0.0026 | 0.0026 | 0 |
| 9 | 15 | 15 | 0.0039 | 0.0039 | 0 |
| 10 | 11 | 11 | 0.0021 | 0.0021 | 0 |
| 10 | 19 | 19 | 0.0053 | 0.0053 | 0 |
| 10 | 20 | 20 | 0.0005 | 0.0005 | 0 |
| 11 | 19 | 19 | 0.0032 | 0.0032 | 0 |
| 11 | 21 | 21 | 0.0089 | 0.0089 | 0 |
| 12 | 8 | 8 | 0.0100 | 0.0100 | 0 |
| 12 | 13 | 13 | 0.0036 | 0.0036 | 0 |
| 13 | 6 | 6 | 0.0049 | 0.0049 | 0 |
| 13 | 8 | 8 | 0.0025 | 0.0025 | 0 |
| 13 | 14 | 14 | 0.0031 | 0.0031 | 0 |
| 13 | 18 | 18 | 0.0006 | 0.0006 | 0 |
| 14 | 4 | 4 | 0.0013 | 0.0013 | 0 |
| 14 | 9 | 9 | 0.0065 | 0.0065 | 0 |
| 15 | 10 | 10 | 0.0017 | 0.0017 | 0 |
| 15 | 20 | 20 | 0.0050 | 0.0050 | 0 |
| 16 | 12 | 12 | 0.0102 | 0.0102 | 0 |
| 16 | 17 | 17 | 0.0007 | 0.0007 | 0 |
| 17 | 12 | 12 | 0.0023 | 0.0023 | 0 |
| 17 | 13 | 13 | 0.0040 | 0.0040 | 0 |
| 17 | 18 | 18 | 0.0023 | 0.0023 | 0 |
| 18 | 13 | 13 | 0.0038 | 0.0038 | 0 |
| 18 | 14 | 14 | 0.0046 | 0.0046 | 0 |
| 19 | 21 | 21 | 0.0022 | 0.0022 | 0 |
| 19 | 26 | 26 | 0.0033 | 0.0033 | 0 |
| 19 | 27 | 27 | 0.0033 | 0.0033 | 0 |
| 20 | 19 | 19 | 0.0008 | 0.0008 | 0 |
| 20 | 26 | 26 | 0.0051 | 0.0051 | 0 |
| 21 | 27 | 27 | 0.0124 | 0.0124 | 0 |
| 22 | 16 | 16 | 0.0102 | 0.0102 | 0 |
| 22 | 23 | 23 | 0.0006 | 0.0006 | 0 |
| 23 | 17 | 17 | 0.0073 | 0.0073 | 0 |
| 23 | 22 | 22 | 0.0005 | 0.0005 | 0 |
| 24 | 18 | 18 | 0.0043 | 0.0043 | 0 |
| 24 | 23 | 23 | 0.0007 | 0.0007 | 0 |
| 25 | 22 | 22 | 0.0101 | 0.0101 | 0 |
| 25 | 28 | 28 | 0.0007 | 0.0007 | 0 |
| 26 | 27 | 27 | 0.0006 | 0.0006 | 0 |
| 26 | 32 | 32 | 0.0056 | 0.0056 | 0 |
| 26 | 33 | 33 | 0.0011 | 0.0011 | 0 |
| 27 | 26 | 26 | 0.0004 | 0.0004 | 0 |
| 27 | 33 | 33 | 0.0106 | 0.0106 | 0 |
| 28 | 22 | 22 | 0.0005 | 0.0005 | 0 |
| 28 | 23 | 23 | 0.0047 | 0.0047 | 0 |
| 28 | 25 | 25 | 0.0028 | 0.0028 | 0 |
| 29 | 23 | 23 | 0.0018 | 0.0018 | 0 |
| 29 | 24 | 24 | 0.0043 | 0.0043 | 0 |
| 30 | 25 | 25 | 0.0052 | 0.0052 | 0 |
| 30 | 28 | 28 | 0.0042 | 0.0042 | 0 |
| 31 | 29 | 29 | 0.0034 | 0.0034 | 0 |
| 32 | 37 | 37 | 0.0027 | 0.0027 | 0 |
| 32 | 40 | 40 | 0.0048 | 0.0048 | 0 |
| 33 | 32 | 32 | 0.0016 | 0.0016 | 0 |
| 33 | 40 | 40 | 0.0012 | 0.0012 | 0 |
| 33 | 44 | 44 | 0.0077 | 0.0077 | 0 |
| 34 | 28 | 28 | 0.0032 | 0.0032 | 0 |
| 34 | 30 | 30 | 0.0032 | 0.0032 | 0 |
| 35 | 30 | 30 | 0.0050 | 0.0050 | 0 |
| 35 | 34 | 34 | 0.0025 | 0.0025 | 0 |
| 36 | 29 | 29 | 0.0031 | 0.0031 | 0 |
| 36 | 31 | 31 | 0.0006 | 0.0006 | 0 |
| 36 | 34 | 34 | 0.0013 | 0.0013 | 0 |
| 37 | 39 | 39 | 0.0035 | 0.0035 | 0 |
| 37 | 48 | 48 | 0.0017 | 0.0017 | 0 |
| 38 | 31 | 31 | 0.0011 | 0.0011 | 0 |
| 38 | 36 | 36 | 0.0033 | 0.0033 | 0 |
| 39 | 31 | 31 | 0.0010 | 0.0010 | 0 |
| 39 | 38 | 38 | 0.0019 | 0.0019 | 0 |
| 39 | 46 | 46 | 0.0014 | 0.0014 | 0 |
| 40 | 37 | 37 | 0.0018 | 0.0018 | 0 |
| 40 | 48 | 48 | 0.0036 | 0.0036 | 0 |
| 40 | 56 | 56 | 0.0036 | 0.0036 | 0 |
| 41 | 34 | 34 | 0.0020 | 0.0020 | 0 |
| 41 | 35 | 35 | 0.0024 | 0.0024 | 0 |
| 42 | 35 | 35 | 0.0050 | 0.0050 | 0 |
| 43 | 34 | 34 | 0.0014 | 0.0014 | 0 |
| 43 | 36 | 36 | 0.0007 | 0.0007 | 0 |
| 43 | 41 | 41 | 0.0021 | 0.0021 | 0 |
| 44 | 40 | 40 | 0.0042 | 0.0042 | 0 |
| 44 | 56 | 56 | 0.0058 | 0.0058 | 0 |
| 45 | 39 | 39 | 0.0024 | 0.0024 | 0 |
| 45 | 53 | 53 | 0.0024 | 0.0024 | 0 |
| 46 | 38 | 38 | 0.0019 | 0.0019 | 0 |
| 46 | 47 | 47 | 0.0033 | 0.0033 | 0 |
| 47 | 43 | 43 | 0.0034 | 0.0034 | 0 |
| 47 | 52 | 52 | 0.0010 | 0.0010 | 0 |
| 48 | 50 | 50 | 0.0059 | 0.0059 | 0 |
| 48 | 58 | 58 | 0.0017 | 0.0017 | 0 |
| 49 | 42 | 42 | 0.0040 | 0.0040 | 0 |
| 50 | 46 | 46 | 0.0015 | 0.0015 | 0 |
| 50 | 54 | 54 | 0.0037 | 0.0037 | 0 |
| 50 | 62 | 62 | 0.0011 | 0.0011 | 0 |
| 51 | 41 | 41 | 0.0016 | 0.0016 | 0 |
| 51 | 49 | 49 | 0.0030 | 0.0030 | 0 |
| 52 | 43 | 43 | 0.0006 | 0.0006 | 0 |
| 52 | 51 | 51 | 0.0042 | 0.0042 | 0 |
| 52 | 63 | 63 | 0.0009 | 0.0009 | 0 |
| 53 | 39 | 39 | 0.0003 | 0.0003 | 0 |
| 53 | 46 | 46 | 0.0013 | 0.0013 | 0 |
| 53 | 54 | 54 | 0.0036 | 0.0036 | 0 |
| 54 | 47 | 47 | 0.0008 | 0.0008 | 0 |
| 54 | 55 | 55 | 0.0045 | 0.0045 | 0 |
| 54 | 60 | 60 | 0.0011 | 0.0011 | 0 |
| 55 | 47 | 47 | 0.0009 | 0.0009 | 0 |
| 55 | 52 | 52 | 0.0040 | 0.0040 | 0 |
| 55 | 64 | 64 | 0.0009 | 0.0009 | 0 |
| 56 | 48 | 48 | 0.0031 | 0.0031 | 0 |
| 56 | 58 | 58 | 0.0054 | 0.0054 | 0 |
| 57 | 45 | 45 | 0.0008 | 0.0008 | 0 |
| 57 | 53 | 53 | 0.0056 | 0.0056 | 0 |
| 58 | 50 | 50 | 0.0012 | 0.0012 | 0 |
| 58 | 62 | 62 | 0.0050 | 0.0050 | 0 |
| 59 | 45 | 45 | 0.0009 | 0.0009 | 0 |
| 59 | 57 | 57 | 0.0066 | 0.0066 | 0 |
| 59 | 65 | 65 | 0.0009 | 0.0009 | 0 |
| 60 | 55 | 55 | 0.0005 | 0.0005 | 0 |
| 60 | 64 | 64 | 0.0043 | 0.0043 | 0 |
| 61 | 60 | 60 | 0.0036 | 0.0036 | 0 |
| 61 | 66 | 66 | 0.0008 | 0.0008 | 0 |
| 62 | 54 | 54 | 0.0007 | 0.0007 | 0 |
| 62 | 60 | 60 | 0.0017 | 0.0017 | 0 |
| 62 | 66 | 66 | 0.0030 | 0.0030 | 0 |
| 63 | 67 | 67 | 0.0011 | 0.0011 | 0 |
| 63 | 68 | 68 | 0.0004 | 0.0004 | 0 |
| 63 | 72 | 72 | 0.0008 | 0.0008 | 0 |
| 64 | 52 | 52 | 0.0007 | 0.0007 | 0 |
| 64 | 63 | 63 | 0.0004 | 0.0004 | 0 |
| 64 | 68 | 68 | 0.0031 | 0.0031 | 0 |
| 65 | 57 | 57 | 0.0009 | 0.0009 | 0 |
| 65 | 61 | 61 | 0.0051 | 0.0051 | 0 |
| 66 | 71 | 71 | 0.0040 | 0.0040 | 0 |
| 67 | 72 | 72 | 0.0006 | 0.0006 | 0 |
| 67 | 76 | 76 | 0.0011 | 0.0011 | 0 |
| 68 | 63 | 63 | 0.0003 | 0.0003 | 0 |
| 68 | 72 | 72 | 0.0026 | 0.0026 | 0 |
| 68 | 73 | 73 | 0.0010 | 0.0010 | 0 |
| 69 | 59 | 59 | 0.0053 | 0.0053 | 0 |
| 69 | 70 | 70 | 0.0043 | 0.0043 | 0 |
| 70 | 59 | 59 | 0.0036 | 0.0036 | 0 |
| 70 | 65 | 65 | 0.0050 | 0.0050 | 0 |
| 71 | 74 | 74 | 0.0039 | 0.0039 | 0 |
| 72 | 73 | 73 | 0.0033 | 0.0033 | 0 |
| 72 | 76 | 76 | 0.0008 | 0.0008 | 0 |
| 72 | 77 | 77 | 0.0012 | 0.0012 | 0 |
| 73 | 77 | 77 | 0.0033 | 0.0033 | 0 |
| 73 | 80 | 80 | 0.0016 | 0.0016 | 0 |
| 74 | 79 | 79 | 0.0041 | 0.0041 | 0 |
| 75 | 69 | 69 | 0.0055 | 0.0055 | 0 |
| 75 | 70 | 70 | 0.0055 | 0.0055 | 0 |
| 76 | 77 | 77 | 0.0029 | 0.0029 | 0 |
| 76 | 82 | 82 | 0.0010 | 0.0010 | 0 |
| 77 | 80 | 80 | 0.0050 | 0.0050 | 0 |
| 77 | 82 | 82 | 0.0020 | 0.0020 | 0 |
| 78 | 69 | 69 | 0.0040 | 0.0040 | 0 |
| 78 | 75 | 75 | 0.0063 | 0.0063 | 0 |
| 79 | 83 | 83 | 0.0050 | 0.0050 | 0 |
| 80 | 84 | 84 | 0.0057 | 0.0057 | 0 |
| 81 | 75 | 75 | 0.0061 | 0.0061 | 0 |
| 81 | 78 | 78 | 0.0056 | 0.0056 | 0 |
| 82 | 84 | 84 | 0.0047 | 0.0047 | 0 |
| 83 | 86 | 86 | 0.0058 | 0.0058 | 0 |
| 83 | 88 | 88 | 0.0016 | 0.0016 | 0 |
| 84 | 83 | 83 | 0.0014 | 0.0014 | 0 |
| 84 | 88 | 88 | 0.0072 | 0.0072 | 0 |
| 85 | 78 | 78 | 0.0047 | 0.0047 | 0 |
| 85 | 81 | 81 | 0.0058 | 0.0058 | 0 |
| 86 | 89 | 89 | 0.0062 | 0.0062 | 0 |
| 86 | 90 | 90 | 0.0040 | 0.0040 | 0 |
| 87 | 81 | 81 | 0.0067 | 0.0067 | 0 |
| 87 | 85 | 85 | 0.0037 | 0.0037 | 0 |
| 88 | 86 | 86 | 0.0037 | 0.0037 | 0 |
| 88 | 90 | 90 | 0.0059 | 0.0059 | 0 |
| 89 | 93 | 93 | 0.0068 | 0.0068 | 0 |
| 89 | 96 | 96 | 0.0049 | 0.0049 | 0 |
| 90 | 89 | 89 | 0.0049 | 0.0049 | 0 |
| 90 | 96 | 96 | 0.0061 | 0.0061 | 0 |
| 91 | 85 | 85 | 0.0074 | 0.0074 | 0 |
| 91 | 87 | 87 | 0.0037 | 0.0037 | 0 |
| 92 | 87 | 87 | 0.0066 | 0.0066 | 0 |
| 92 | 91 | 91 | 0.0024 | 0.0024 | 0 |
| 92 | 97 | 97 | 0.0006 | 0.0006 | 0 |
| 93 | 95 | 95 | 0.0069 | 0.0069 | 0 |
| 93 | 98 | 98 | 0.0063 | 0.0063 | 0 |
| 94 | 92 | 92 | 0.0074 | 0.0074 | 0 |
| 94 | 99 | 99 | 0.0027 | 0.0027 | 0 |
| 95 | 94 | 94 | 0.0074 | 0.0074 | 0 |
| 95 | 100 | 100 | 0.0054 | 0.0054 | 0 |
| 96 | 93 | 93 | 0.0069 | 0.0069 | 0 |
| 96 | 98 | 98 | 0.0056 | 0.0056 | 0 |
| 97 | 91 | 91 | 0.0112 | 0.0112 | 0 |
| 97 | 92 | 92 | 0.0016 | 0.0016 | 0 |
| 98 | 95 | 95 | 0.0057 | 0.0057 | 0 |
| 98 | 100 | 100 | 0.0071 | 0.0071 | 0 |
| 99 | 92 | 92 | 0.0019 | 0.0019 | 0 |
| 99 | 97 | 97 | 0.0094 | 0.0094 | 0 |
| 100 | 94 | 94 | 0.0027 | 0.0027 | 0 |
| 100 | 99 | 99 | 0.0094 | 0.0094 | 0 |
plotAnimatedFlowmap(
  hvt_model_output,
  transition_probability_df,
  df,
  animation = "All",
  flow_map = "All",
  fps_state,
  fps_time,
  time_duration,
  state_duration,
  cellid_column,
  time_column
)
hvt_model_output - The list object that is the output of the trainHVT function.
transition_probability_df - The probability list that is the output of the getTransitionProbability function.
df - A data frame containing the Cell ID and timestamps.
animation - A character string. If set to 'time_based', an animation in which a red dot moves along the cells according to the timestamp is displayed. If set to 'state_based', an arrow animation based on the highest-probability state, excluding self-state, is displayed. If set to 'All', both animations are displayed. If set to NULL, none is displayed.
flow_map - A character string. If set to 'self_state', a plot showing the self-state probability as circles is displayed; the larger the circle, the higher the probability of staying in the same cell. If set to 'without_self_state', a plot showing the next state as arrows is displayed; the arrow head points to the next cell to move to from the cell at the arrow tail. If set to 'All', both flow maps are displayed. If set to NULL, none is displayed.
fps_time - A numeric value indicating the frames per second for the time transition animation (must be a numeric factor of 100). Default value is 1.
fps_state - A numeric value indicating the frames per second for the state transition animation (must be a numeric factor of 100). Default value is 1.
time_duration - A numeric value indicating the duration of the GIF for the time transition animation. Default value is 2.
state_duration - A numeric value indicating the duration of the GIF for the state transition animation. Default value is 2.
cellid_column - A character string specifying the column name of the Cell ID.
time_column - A character string specifying the column name of the timestamp.
flowmap_plots <- plotAnimatedFlowmap(hvt_model_output = hvt.results,
                                     transition_probability_df = trans_table,
                                     df = temporal_data,
                                     animation = NULL, flow_map = 'All',
                                     fps_time = 30, fps_state = 5,
                                     time_duration = 180, state_duration = 20,
                                     cellid_column = "Cell.ID", time_column = "t")
#> [1] "'animation' argument is NULL"
1. Flow map: Highest transition probability including self-state
The circle around each cell’s centroid represents the self-state probability: the larger the circle, the higher the probability of staying in the same cell.
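As an illustration (with a made-up 3-state matrix, not the vignette's 100-cell model), the self-state probabilities behind the circle sizes are simply the diagonal of the transition matrix; a linear radius scaling is shown here as one plausible choice, not necessarily what plotAnimatedFlowmap does internally:

```r
# Hypothetical transition matrix for 3 cells (each row sums to 1)
trans_prob <- matrix(c(0.6, 0.3, 0.1,
                       0.2, 0.7, 0.1,
                       0.1, 0.2, 0.7),
                     nrow = 3, byrow = TRUE)

# Self-state probability = P(staying in the same cell)
self_state <- diag(trans_prob)

# Map probability to a circle radius (linear scaling is an assumption)
radius <- 0.1 + 0.4 * self_state
```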
flowmap_plots[[1]]
2. Flow map: Highest transition probability excluding self-states: Arrow size represents transition probability
The arrow size represents the probability of the data moving to the next state, and the arrow direction points to the cell it moves to next.
flowmap_plots[[2]]
3. Flow map animation: Highest state transition probabilities including self-state
The red point moves through the cells according to the timestamp, and it blinks when it stays in the same cell for a period of time, which can be read from the sub-header in the GIF.
#flowmap_plots[[3]]
4. Flow map animation: Highest state transition probabilities excluding self-states
The arrow moves from the current cell to the next cell, and its length reflects the transition probability of the next state, excluding self-state.
#flowmap_plots[[4]]